Learnings Options End-to-End for Continuous Action Tasks

Authors

Martin Klissarov

Pierre-Luc Bacon

Jean Harb

Doina Precup

Abstract

We present new results on learning temporally extended actions for continuous tasks, using the options framework (Sutton et al. [1999b], Precup [2000]). To achieve this goal, we work with the option-critic architecture (Bacon et al. [2017]) using a deliberation cost and train it with proximal policy optimization (Schulman et al. [2017]) instead of vanilla policy gradient. Results on MuJoCo domains are promising, but lead to interesting questions about when a given option should be used, an issue directly connected to the use of initiation sets.
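The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the default clipping value of 0.2, and the scalar inputs are assumptions chosen for illustration. The first function computes the clipped surrogate objective of proximal policy optimization. The second shows one common way a deliberation cost enters the termination signal in option-critic methods, as a margin added to the option's advantage.

```python
def ppo_clipped_objective(ratios, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate over a batch (to be maximized).

    ratios: new-policy / old-policy probability ratios, one per sample.
    advantages: advantage estimates, one per sample.
    clip_eps: clipping range (0.2 is an assumed, commonly used default).
    """
    terms = []
    for r, a in zip(ratios, advantages):
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped_r = max(1.0 - clip_eps, min(r, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the two surrogate terms.
        terms.append(min(r * a, clipped_r * a))
    return sum(terms) / len(terms)


def termination_advantage(q_option, v_state, deliberation_cost):
    """Signal that weights the termination gradient (hypothetical helper).

    q_option: value of continuing with the current option in the next state.
    v_state: value of the next state over all options.
    deliberation_cost: non-negative margin; larger values make switching
    options less attractive, which encourages longer-lasting options.
    """
    return q_option - v_state + deliberation_cost


if __name__ == "__main__":
    print(ppo_clipped_objective([1.5, 0.5], [1.0, -1.0]))
    print(termination_advantage(1.0, 1.0, 0.1))
```

With a deliberation cost of zero, the second function reduces to the plain option advantage. A positive cost means an option is only abandoned when another option is better by more than that margin.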


Similar resources

Considering a Model for Sustainable Energy Planning Under Uncertainty

In this paper, real options theory is used to evaluate the effect of uncertain electricity and CO2 costs on investment behavior. Methodologically, the contribution of the paper in this respect is that uncertainty is considered not only in terms of stochastic processes and their fluctuation, but also in terms of expected and realized processes, i.e. the procedures, w...


The Position of Implicit and Indirect Learning in Ethical Education

The rationalist approach has dominated values education for many years. A one-sided and excessive focus on this approach has led to the neglect of a large part of human implicit learning and of intuitive ethical judgments, which are the source of many ethical behaviors. The study intends to answer the question of why only direct and deductive educations about va...


Continuous control with deep reinforcement learning

We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain. We present an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces. Using the same learning algorithm, network architecture and hyper-parameters, our algorithm robustly solves more than 20 simulated physics tasks, including classic...


A new reduced mathematical model to simulate the action potential in end plate of skeletal muscle fibers

Mathematicians usually use the Hodgkin-Huxley model or the FitzHugh-Nagumo model to simulate action potentials of skeletal muscle fibers. These models are electrically excitable, but skeletal muscle fibers are stimulated chemically. To investigate skeletal muscle fibers, we use a model with six ordinary differential equations. This dynamical system is sensitive to the initial values of some variables, so it is...


Analysis of the Coupled Nonlinear Vibration of a Two-Mass System

This paper presents a fixed-end two-mass system (TMS) with end constraints that permit uncoupled solutions for different masses. The coupled nonlinear models for the present fixed-end TMS were solved using the continuous piecewise linearization method (CPLM), and a detailed investigation of the effect of mass ratio on the TMS response was conducted. The investigations showed that increased mass-r...



Journal: CoRR

Volume: abs/1712.00004

Publication date: 2017
